Abstract:
The real-world use cases of Machine Learning (ML) have exploded over the past few years. However, the current computing infrastructure is insufficient to support all real-world applications and scenarios. Beyond high efficiency requirements, modern ML systems are expected to be highly reliable against hardware failures and secure against adversarial and IP-stealing attacks. Privacy concerns are also becoming a first-order issue. This article summarizes the main challenges in the agile development of efficient, reliable, and secure ML systems, and then outlines an agile design methodology that generates such systems from user-defined constraints and objectives.
Abstract:
A major requirement of safety-critical systems is high reliability at low power consumption. Dynamic voltage and frequency (v/f) scaling (DVFS) techniques are widely exploited to reduce power consumption. However, downscaling v/f levels degrades the reliability of the tasks running on the cores, while upscaling them accelerates circuit-level aging effects. To achieve high reliability in multicore safety-critical systems, task replication is an established fault-tolerant technique for countering the negative effect of downscaled v/f levels, but it may accelerate aging by elevating on-chip temperatures. In this paper, we propose an aging-aware task replication method (called ATLAS) that satisfies the desired reliability target for a set of periodic hard real-time tasks executed on a multicore system. The proposed method meets the reliability target of the tasks by updating the required number of replicas for each task over the years of operation. We replicate the tasks through our proposed formulas such that the reliability target is satisfied. However, task replication increases the temperature of the system and accelerates aging. To decelerate aging, we attempt to reduce the temperature while mapping and scheduling the tasks. We have also developed a modified demand bound function (DBF) for our aging-aware task replication method to verify the schedulability of the real-time tasks. Compared to existing state-of-the-art techniques, experimental results for safety-critical applications on different configurations of multicore systems demonstrate the efficiency and effectiveness of our proposed method. Experiments show that our method improves schedulability on average by 16.1% and reduces the temperature on average by 7.4°C compared to state-of-the-art methods while meeting the system reliability target.
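The replica-count arithmetic behind such schemes is compact enough to sketch. The snippet below is a minimal illustration in Python, assuming the standard exponential transient-fault model under DVFS and the classic demand bound function for constrained-deadline periodic tasks; ATLAS's own formulas, aging model, and modified DBF are not reproduced from the abstract, and all parameter values are illustrative.

```python
import math

def fault_rate(f, lam0=1e-6, d=3.0, f_max=1.0, f_min=0.4):
    # Standard exponential transient-fault model under DVFS:
    # faults become more likely as the v/f level is scaled down.
    return lam0 * 10 ** (d * (f_max - f) / (f_max - f_min))

def copy_reliability(wcet, f):
    # Probability that one copy of a task with worst-case execution
    # time `wcet` (at f_max) completes fault-free at frequency f.
    return math.exp(-fault_rate(f) * wcet / f)

def replicas_needed(wcet, f, target):
    # Smallest n with 1 - (1 - R)^n >= target, i.e. the task meets
    # its reliability target if at least one copy finishes correctly.
    r = copy_reliability(wcet, f)
    if r >= target:
        return 1
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - r))

def dbf(tasks, t):
    # Classic demand bound function for periodic tasks (C, D, T):
    # total execution demand in any interval of length t.
    return sum(max(0, (t - D) // T + 1) * C for (C, D, T) in tasks)

# A 10 ms task at 60% of f_max against a 1 - 1e-9 reliability target:
print(replicas_needed(0.010, 0.6, 1.0 - 1e-9))   # -> 2 copies
```

Each replica added this way enters the DBF as extra demand, which is why replication trades schedulability and temperature against reliability.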
Abstract:
The advancements of deep neural networks (DNNs) have led to their deployment in diverse settings, including safety- and security-critical applications. As a result, the characteristics of these models (e.g., the architecture of layers and weight values/distributions) have become sensitive intellectual property that requires protection from malicious users. Extracting the architecture of a DNN through leaky side-channels (e.g., memory access) allows adversaries to (i) clone the model (i.e., build proxy models with similar accuracy profiles), and (ii) craft adversarial attacks. DNN obfuscation thwarts side-channel-based architecture stealing (SCAS) attacks by altering the run-time traces of a given DNN while preserving its functionality. In this work, we expose the vulnerability of state-of-the-art DNN obfuscation methods (based on predictable and reversible modifications of a given DNN architecture) to these attacks. We present NeuroUnlock, a novel SCAS attack against obfuscated DNNs. NeuroUnlock employs a sequence-to-sequence model that learns the obfuscation procedure and automatically reverts it, thereby recovering the original DNN architecture. We demonstrate the effectiveness of NeuroUnlock by recovering the architecture of 200 randomly generated and obfuscated DNNs running on the Nvidia RTX 2080 Ti graphics processing unit (GPU). Moreover, NeuroUnlock recovers the architecture of various other obfuscated (and publicly available) DNNs, such as the VGG-11, VGG-13, ResNet-20, and ResNet-32 networks. After recovering the architecture, NeuroUnlock automatically builds a near-equivalent DNN with only a 1.4% drop in testing accuracy. We further show that launching a subsequent adversarial attack on the recovered DNNs boosts its success rate by 51.7% on average compared to launching it on the obfuscated versions. Additionally, we propose a novel methodology for DNN obfuscation, ReDLock, which eradicates the deterministic nature of the obfuscation and achieves 2.16× more resilience to the NeuroUnlock attack. We release NeuroUnlock and ReDLock as open-source frameworks at https://github.com/Mahya-Ahmadi/NeuroUnlock.
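The core of such an attack can be pictured as ordinary sequence-to-sequence learning over layer-token strings: pairs of (obfuscated trace, original architecture) are the training data, and inference inverts the obfuscation on unseen traces. The sketch below is a toy PyTorch illustration assuming a hypothetical token vocabulary and a plain GRU encoder-decoder; NeuroUnlock's actual trace encoding and model are described in the paper, not here.

```python
import torch
import torch.nn as nn

# Toy vocabulary of layer tokens; real run-time traces encode layer
# types and hyper-parameters (these names are illustrative only).
VOCAB = ["<pad>", "<sos>", "<eos>", "conv3x3", "conv1x1", "relu",
         "pool", "fc"]
tok = {t: i for i, t in enumerate(VOCAB)}

class Seq2Seq(nn.Module):
    """GRU encoder-decoder: obfuscated trace -> original architecture."""
    def __init__(self, vocab, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.enc = nn.GRU(hidden, hidden, batch_first=True)
        self.dec = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, src, tgt_in):
        _, h = self.enc(self.emb(src))        # summarize obfuscated trace
        y, _ = self.dec(self.emb(tgt_in), h)  # teacher-forced decoding
        return self.out(y)

# One (obfuscated, original) pair; the attack learns the obfuscator's
# edit pattern from many such pairs and then inverts it on new traces.
src = torch.tensor([[tok["conv3x3"], tok["conv1x1"], tok["relu"], tok["pool"]]])
tgt_in = torch.tensor([[tok["<sos>"], tok["conv3x3"], tok["relu"], tok["pool"]]])
gold = torch.tensor([[tok["conv3x3"], tok["relu"], tok["pool"], tok["<eos>"]]])

model = Seq2Seq(len(VOCAB))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
opt.zero_grad()
loss = nn.CrossEntropyLoss()(model(src, tgt_in).transpose(1, 2), gold)
loss.backward()
opt.step()
```

This framing also explains ReDLock's countermeasure: a randomized, non-deterministic obfuscator gives the sequence model no stable mapping to learn.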
Abstract:
Autonomous Driving (AD) related features provide new forms of mobility that are also beneficial for other kinds of intelligent and autonomous systems, such as robots, smart transportation, and smart industries. For these applications, decisions need to be made quickly and in real time. Moreover, in the quest for electric mobility, this task must follow a low-power policy without significantly affecting the autonomy of the means of transport or the robot. These two challenges can be tackled using the emerging Spiking Neural Networks (SNNs). When deployed on specialized neuromorphic hardware, SNNs can achieve high performance with low latency and low power consumption. In this paper, we use an SNN connected to an event-based camera to tackle one of the key problems for AD, i.e., the classification of cars versus other objects. To consume less power than traditional frame-based cameras, we use a Dynamic Vision Sensor (DVS) [1]. The experiments follow an offline supervised learning rule, after which the learnt SNN model is mapped onto the Intel Loihi Neuromorphic Research Chip [2]. Our best experiment achieves an offline accuracy of 86%, which drops to 83% when the model is ported onto the Loihi chip. The neuromorphic hardware implementation has a maximum latency of 0.72 ms per sample and consumes only 310 mW. To the best of our knowledge, this work is the first implementation of an event-based car classifier on a neuromorphic chip.
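To make the ingredients concrete, the following is a minimal NumPy sketch of the two building blocks such a pipeline rests on: binning DVS events into a sparse frame and stepping leaky integrate-and-fire (LIF) neurons. The event data, weights, and sizes here are random placeholders; the paper's trained network and its Loihi mapping are not reproduced.

```python
import numpy as np

def lif_step(v, i_in, v_th=1.0, leak=0.9):
    # One LIF update: leak the membrane potential, integrate the input
    # current, emit spikes where the threshold is crossed, then reset.
    v = leak * v + i_in
    spikes = (v >= v_th).astype(np.float32)
    v = np.where(spikes > 0, 0.0, v)   # reset-to-zero on spike
    return v, spikes

# A DVS emits a stream of (x, y, timestamp, polarity) events; a common
# preprocessing step bins them into sparse per-pixel count frames.
rng = np.random.default_rng(0)
events = rng.integers(0, 32, size=(500, 2))        # toy (x, y) events
frame = np.zeros((32, 32), dtype=np.float32)
np.add.at(frame, (events[:, 1], events[:, 0]), 1)  # accumulate per pixel

v = np.zeros(10)                        # membrane potentials, 10 outputs
w = rng.normal(0, 0.05, (10, 32 * 32))  # random placeholder weights
for _ in range(20):                     # simulate 20 timesteps
    v, spikes = lif_step(v, w @ frame.ravel())
```

Because computation happens only where events and spikes occur, this style of processing is what lets neuromorphic chips like Loihi reach millisecond latencies at milliwatt power.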
Abstract:
The paper addresses some of the opportunities and challenges related to the test and reliability of three major emerging computing paradigms: Quantum Computing, computing engines based on Deep Neural Networks for AI, and Approximate Computing (AxC). We first present a quantum accelerator, showing that useful acceleration can be achieved even without very good qubits. Then, we discuss dependability for Artificial Intelligence (AI)-oriented hardware. Indeed, AI applications have shown considerable resilience to faults, meaning that testing strongly depends on application behavior rather than on the hardware structure. We cover AI hardware design issues due to manufacturing defects, aging faults, and soft errors. Finally, we present the use of AxC to reduce the cost of hardening a digital circuit without impacting its reliability; in other words, how to go beyond the usual modular redundancy schemes.
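As a rough illustration of that last point, hardening cost can be cut by replacing exact replicas with approximate ones. The toy sketch below is entirely illustrative (the abstract does not detail the paper's concrete scheme): it contrasts a classic bitwise TMR voter with a reduced-precision guard copy that flags faults at far less than triple cost.

```python
def tmr_vote(a, b, c):
    # Classic triple modular redundancy: the bitwise majority of three
    # exact replicas masks any single faulty copy.
    return (a & b) | (a & c) | (b & c)

def guarded_eval(f, x, bits=4):
    # AxC-flavoured alternative: one exact copy is checked against a
    # cheap truncated re-execution; disagreement beyond the known
    # truncation error signals a fault, at well under 3x area/energy.
    exact = f(x)
    cheap = (f(x) >> bits) << bits           # approximate replica
    fault = abs(exact - cheap) >= (1 << bits)
    return exact, fault

print(bin(tmr_vote(0b1010, 0b1010, 0b0010)))  # -> 0b1010, fault masked
print(guarded_eval(lambda x: 3 * x + 7, 12))  # -> (43, False)
```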
Abstract:
Gigantic rates of data production in the era of Big Data, the Internet of Things (IoT), and Smart Cyber-Physical Systems (CPS) pose incessantly escalating demands for massive data processing, storage, and transmission, while continuously interacting with the physical world through edge sensors and actuators. For IoT systems, there is now a strong trend to move the intelligence from the cloud to the edge or the extreme edge (known as TinyML). Yet, this shift to edge AI systems requires designing powerful machine learning systems under very strict resource constraints. This poses a difficult design task that must take into account the complete system stack, from the machine learning algorithm, to model optimization and compression, to software implementation, to hardware platform and ML accelerator design. This paper discusses the open research challenges in achieving such a holistic design space exploration for HW/SW co-design of edge AI systems, and discusses the current state with three currently developed flows: one design flow for systems with tightly-coupled accelerator architectures based on RISC-V, one approach using loosely-coupled, application-specific accelerators, and one framework that integrates software and hardware optimization techniques to build efficient Deep Neural Network (DNN) systems.
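For a flavour of what such an exploration involves, the sketch below enumerates a toy two-knob design space (quantization bitwidth, number of processing elements), prunes designs that violate a real-time constraint, and keeps the Pareto front over latency, energy, and accuracy. All cost models are invented placeholders, not taken from any of the three flows above.

```python
from itertools import product

# Invented placeholder cost models for a toy DSE: bitwidth b, PE count p.
def latency_ms(b, p): return 120.0 * b / (8 * p)
def energy_mj(b, p):  return 0.6 * b + 0.05 * p
def accuracy(b):      return 0.92 - 0.015 * (8 - b)

def dominates(q, p):
    # q dominates p if it is no worse in every objective, better in one.
    return (all(q[k] <= p[k] for k in range(3)) and
            any(q[k] < p[k] for k in range(3)))

# Objectives are (latency, energy, -accuracy): all minimized.
space = [(latency_ms(b, p), energy_mj(b, p), -accuracy(b), b, p)
         for b, p in product([4, 6, 8], [2, 4, 8])
         if latency_ms(b, p) <= 30.0]          # real-time constraint
front = [p for p in space if not any(dominates(q, p) for q in space)]
for lat, en, nacc, b, p in sorted(front):
    print(f"{b}-bit, {p} PEs: {lat:5.1f} ms, {en:.2f} mJ, acc {-nacc:.3f}")
```

Real co-design flows explore far larger spaces (layer-wise quantization, mapping, accelerator microarchitecture) with learned or measured cost models, but the constraint-then-Pareto structure is the same.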
Abstract:
Feature matching is an important step in many computational photography applications such as image stitching, 3D reconstruction, and object recognition. KD-tree-based Best Bin First (KD-BBF) search is one of the most widely used feature matching schemes, commonly employed with SIFT and SURF descriptors. The real-time requirements of such computer vision applications on embedded systems place tight compute bounds on the processor. In this paper, we propose an architecture based on a soft-core processor and a hardware accelerator that enables real-time matching of SIFT feature descriptors for HD-resolution images at 30 FPS. The proposed accelerator provides a speedup of more than 8× over the pure software implementation.
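Since the paper's contribution is the hardware architecture, the algorithm it accelerates is worth recalling. Below is a compact software sketch of KD-tree construction and BBF matching, assuming squared-L2 matching over SIFT-like 128-d descriptors; it illustrates the search being accelerated, not the proposed soft-core/accelerator design.

```python
import heapq
import numpy as np

class Node:
    __slots__ = ("axis", "thresh", "left", "right", "point", "idx")
    def __init__(self, **kw):
        for k in self.__slots__:
            setattr(self, k, kw.get(k))

def build(pts, idxs, depth=0):
    # Standard KD-tree over descriptor rows, splitting on the axis
    # that cycles with depth.
    if len(idxs) == 1:
        return Node(point=pts[idxs[0]], idx=idxs[0])
    axis = depth % pts.shape[1]
    idxs = sorted(idxs, key=lambda i: pts[i][axis])
    mid = len(idxs) // 2
    return Node(axis=axis, thresh=pts[idxs[mid]][axis],
                left=build(pts, idxs[:mid], depth + 1),
                right=build(pts, idxs[mid:], depth + 1))

def bbf_match(root, q, max_leaves=64):
    # Best Bin First: visit branches in order of their distance to the
    # query plane and stop after a fixed leaf budget (approximate NN).
    best_idx, best_d = None, float("inf")
    heap, leaves = [(0.0, 0, root)], 0
    while heap and leaves < max_leaves:
        _, _, node = heapq.heappop(heap)
        while node.point is None:             # descend toward the query
            d = q[node.axis] - node.thresh
            near, far = ((node.left, node.right) if d < 0
                         else (node.right, node.left))
            heapq.heappush(heap, (abs(d), id(far), far))
            node = near
        leaves += 1
        d = float(np.sum((q - node.point) ** 2))
        if d < best_d:
            best_idx, best_d = node.idx, d
    return best_idx, best_d

rng = np.random.default_rng(1)
db = rng.normal(size=(1000, 128)).astype(np.float32)  # SIFT-like vectors
tree = build(db, list(range(len(db))))
print(bbf_match(tree, db[42] + 0.01))                 # ~ index 42
```

The fixed leaf budget is what makes BBF attractive for hardware: it bounds the worst-case work per query, trading a small amount of matching accuracy for predictable real-time throughput.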